ABSTRACT



In a signal pattern recognition apparatus, a plurality of feature transformation sections respectively transform an inputted signal pattern into vectors in a plurality of feature spaces corresponding respectively to predetermined classes using a predetermined transformation parameter corresponding to each of the classes so as to emphasize a feature of each of the classes, and a plurality of discriminant function sections respectively calculate a value of a discriminant function using a predetermined discriminant function representing a similarity measure of each of the classes for the transformed vectors in the plurality of feature spaces. Then, a selection section executes a signal pattern recognition process by selecting a class to which the inputted signal pattern belongs based on the calculated values of a plurality of discriminant functions corresponding respectively to the classes, and a training control section trains and sets a plurality of transformation parameters of the feature transformation process and a plurality of discriminant functions so that an error probability of the signal pattern recognition is minimized based on a predetermined training signal pattern.

16 Claims, 7 Drawing Sheets
SIGNAL PATTERN RECOGNITION APPARATUS COMPRISING PARAMETER TRAINING CONTROLLER FOR TRAINING FEATURE CONVERSION PARAMETERS AND DISCRIMINANT FUNCTIONS

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a signal pattern recognition apparatus and a method for recognizing a signal pattern, and in particular, to a signal pattern recognition apparatus for recognizing a signal pattern, which can be expressed by numerical values and belongs to a class of information, such as a speech signal pattern, a character signal pattern, an image signal pattern or the like, said signal pattern recognition apparatus comprising a parameter training controller for training feature conversion parameters and discriminant functions, and a method for recognizing a signal pattern, said method including a step of training feature conversion parameters and discriminant functions.

2. Description of the Related Art

Basically, a signal pattern recognition such as a speech recognition, a character recognition, an image recognition or the like can be comprehended as a matter of classifying a signal pattern, which is a quantity obtained by numerically observing an object to be recognized such as a speech signal, a character signal, an image signal or the like, into one of predetermined classes so as to establish a correspondence therebetween. The signal pattern recognition is basically composed of a feature extraction section 1 and a classification section 2 as shown in FIG. 2. In this case, the feature extraction section 1 transforms an inputted signal pattern such as a speech, a character, or an image containing a high-dimension observed value including information unnecessary for the recognition, into a feature value, or a low-dimension information representing a class identity required for the signal pattern recognition. Thereafter, the classification section 2 classifies the feature value obtained through the transformation into a predetermined class so as to establish a correspondence therebetween, and outputs a classification result.

The classification section 2 fundamentally comprises a plurality of K discriminant function calculators 3-1 through 3-K, and a selector 4 as shown in FIG. 3. By previously determining a membership of an inputted feature value to each class or a "discriminant function" representing a similarity measure of each class, and making the inputted signal pattern correspond to a class in which the value of the discriminant function is maximized or minimized, the classifying process is executed. In other words, a feature vector x of the inputted signal pattern is inputted to each of the discriminant function calculators 3-1 through 3-K having respective predetermined discriminant functions, and the discriminant function calculators 3-1 through 3-K respectively calculate discriminant function values by means of the respective predetermined discriminant functions, and output the resulting values to the selector 4. The selector 4 selects a predetermined maximum or minimum discriminant function value among a plurality of K discriminant function values, and then outputs the information of the class designated by the discriminant function calculator which has outputted the selected discriminant function value as a classification result. Thus, the signal pattern recognition has been conventionally performed by training the discriminant functions of the respective discriminant function calculators 3-1 through 3-K by means of a set of training samples whose classes have been known, and recognizing new unknown samples which are not always included in the set of the training samples.

Furthermore, there has been conventionally provided only one feature extraction section 1, which transforms an inputted signal pattern into a predetermined feature space, for a plurality of K discriminant function calculators 3-1 through 3-K in a manner as shown in FIG. 4. In other words, one feature space is given commonly to all the classes. It is to be herein noted that the feature transformation performed by the feature extraction section 1 is set a priori independently of the setting of the discriminant functions in the classification section 2, and the discriminant functions are set by the classification section 2 after the feature space obtained by the feature extraction section 1 is given thereto.

In signal pattern recognition, a variety of excellent feature extraction methods have been proposed and put into practice. A statistical linear feature extraction method represented by Karhunen-Loève (KL) transformation and multiple discriminant analysis has been extensively used for character and image recognition because of its rigorous mathematical ground and simplicity of calculation (See, for example, Erkki Oja, "Subspace Methods of Pattern Recognition", translated by Hidemitsu Ogawa and Makoto Sato, Sangyo Tosho, 1986 (referred to as a Reference Document 1 hereinafter), or Jun-ichi Toriwaki, "Pattern Recognition Engineering", compiled by The Institute of Television Engineers of Japan, CORONA PUBLISHING CO., LTD., 1993 (referred to as a Reference Document 2 hereinafter)). In particular, as a speech feature extraction method in the field of speech recognition, there are known a short-time power spectrum based on Fourier transformation, a linear prediction analysis (LPC) method, and a cepstrum method (See, for example, L. Rabiner and B. H. Juang, "Fundamentals of Speech Recognition", Prentice-Hall International Inc., 1993 (referred to as a Reference Document 3 hereinafter)). The above-mentioned methods can efficiently express the linguistic feature of a speech required for class classification in a relatively low dimension, and therefore, each of the methods is often used as a feature extraction method in the current speech recognition apparatuses.

However, in an actual environment when using a signal pattern recognition system, the above-mentioned excellent feature values change due to a variety of factors even in an identical class. For instance, in the case of speech recognition, a difference of speaker, a difference of speaking style such as a difference of utterance speed and a difference of coarticulation, and a difference of acoustic environment such as background noise can be enumerated as important variation factors. The above-mentioned fact causes deterioration of a capability in speaker independence and reduction of noise resistance of the speech recognition apparatus. In a practical situation, training of the speech recognition apparatus for making the apparatus have a good recognition capability for unknown samples different from the training samples used for the training of the speech recognition apparatus is intrinsically important. In order to achieve the above-mentioned purpose, it is important to analyze the statistical variation of the feature value in each class. Conventionally, as a basic technique for expressing a variation of a feature value, there has been often used a so-called multiple template technique for assigning a plurality of templates which are reference vectors to respective classes. The above-mentioned technique includes a method for achieving classification by a distance from a plurality of templates such as learning vector
quantization (LVQ) (See, for example, S. Katagiri, C. H. Lee, and B. H. Juang, "New discriminative training algorithms based on the generalized probabilistic descent method", Proceedings of 1991 IEEE Workshop on Neural Networks for Signal Processing, pp. 299-308, Princeton, N.J. in U.S.A., in September, 1991 (referred to as a Reference Document 4 hereinafter)) and a method based on the Hidden Markov Model (HMM) according to a mixture Gaussian distribution (See, for example, the Reference Document 3, or X. D. Huang, Y. Ariki, and M. A. Jack, "Hidden Markov Models for Speech Recognition", Edinburgh University Press, Edinburgh, 1990 (referred to as a Reference Document 5 hereinafter)). According to the above-mentioned multiple template techniques, when the number of templates is increased for the purpose of smartly expressing the variation, the number of parameters to be simultaneously adjusted increases. Therefore, the parameters are excessively adjusted by a finite number of training samples resulting in an over-training, thereby causing such a problem that an accuracy in recognizing unknown samples deteriorates. For the above-mentioned reasons, there is often adopted a means for preventing the deterioration of the accuracy in recognizing the unknown samples by reducing the number of parameters as far as possible, thereby suppressing the accuracy in recognizing the training samples. However, the optimization in number of the parameters is a difficult problem, and therefore, the optimization therein is normally performed heuristically.

The conventional signal pattern recognition apparatus shown in FIG. 4 employs the feature extraction section 1 which is empirically trained independently of the classification section 2 as well as a discriminant function given a priori. The thus obtained signal pattern recognition result is such that neither the feature extraction nor the discriminant function is consistent with the original purpose or goal of the signal pattern recognition of minimizing the recognition error. Therefore, it is not guaranteed to ensure the optimal state in terms of the recognition accuracy, resulting in a problem that the recognition accuracy is relatively low.

Furthermore, it is originally desired for the signal pattern recognition apparatus to correctly recognize any unknown inputted signal pattern not used in the training stage. However, an inputted signal pattern which can be correctly recognized by the signal pattern recognition using the feature extraction and the discriminant function given a priori is theoretically only an inputted signal pattern used in the training stage. Since the empirical or a priori feature training and the metric training are not consistent with the recognition results, there has been such a problem that a rational measure concerning [...] has hardly been taken.

SUMMARY OF THE INVENTION

An essential object of the present invention is therefore to provide a signal pattern recognition apparatus capable of recognizing an unknown inputted signal pattern which is not used in the training stage, with a signal pattern recognition accuracy greater than that of the conventional apparatus.

Another object of the present invention is to provide a method for recognizing a signal pattern, capable of recognizing an unknown inputted signal pattern which is not used in the training stage, with a signal pattern recognition accuracy greater than that of the conventional method.

In order to achieve the above-mentioned objective, according to one aspect of the present invention, there is provided a signal pattern recognition apparatus for classifying an inputted signal pattern into one of a plurality of predetermined classes so as to recognize the inputted signal pattern, comprising:

a plurality of feature transformation means for respectively transforming the inputted signal pattern into vectors in a plurality of feature spaces corresponding respectively to said classes by executing a feature transformation process by means of a predetermined transformation parameter corresponding to each of said classes so as to emphasize a feature of each of said classes, said feature transformation means being provided respectively for said plurality of classes;

a plurality of discriminant function means for respectively calculating a value of a discriminant function by means of a predetermined discriminant function representing a similarity measure of each of said classes for said vectors in said plurality of feature spaces which are transformed by said plurality of feature transformation means, said discriminant function means being provided respectively for said plurality of classes;

selection means for executing a signal pattern recognition process by selecting a class to which the inputted signal pattern belongs based on the values of said plurality of discriminant functions corresponding respectively to said classes, said discriminant functions being obtained through a calculation executed by said plurality of discriminant function means; and

training control means for training and setting said plurality of transformation parameters of said feature transformation process and said plurality of discriminant functions, so that an error probability of said signal pattern recognition is minimized based on a predetermined training signal pattern.

In the above-mentioned signal pattern recognition apparatus, each of said plurality of feature transformation means preferably linearly transforms the inputted signal pattern into vectors in said plurality of feature spaces corresponding respectively to said classes by projecting the inputted signal pattern onto a predetermined basis vector and multiplying a resulting vector by a predetermined real number.

In the above-mentioned signal pattern recognition apparatus, each of said plurality of discriminant functions of the discriminant function means is preferably a predetermined quadric discriminant function representing the similarity measure of each of said classes.

In the above-mentioned signal pattern recognition apparatus, said training control means preferably performs adaptation of said plurality of transformation parameters of said feature transformation process and said plurality of discriminant functions, so that the error probability of said signal pattern recognition is minimized, based on said predetermined training signal pattern, by means of an adaptive minimization method utilizing a probabilistic descent theorem.

The signal pattern recognition apparatus preferably further comprises:

signal conversion means for converting an inputted speech into a speech signal and outputting the speech signal; and

feature extraction means for converting the speech signal outputted from said signal conversion means into a predetermined speech feature parameter, and outputting the obtained feature parameter as a signal pattern to said plurality of feature transformation means and said training control means, thereby recognizing the inputted speech.
In the above-mentioned signal pattern recognition apparatus, said feature extraction means transforms the speech signal outputted from said signal conversion means into LPC cepstrum coefficient vectors through linear prediction analysis, and outputs resulting vectors as a signal pattern to said plurality of feature transformation means and said training control means.

The above-mentioned signal pattern recognition apparatus preferably further comprises:

image conversion means for converting a character into dot image data, and outputting the dot image data as a signal pattern to said plurality of feature transformation means and said training control means, thereby recognizing the character.

The above-mentioned signal pattern recognition apparatus preferably further comprises:

further image conversion means for converting an image into dot image data, and outputting the dot image data as a signal pattern to said plurality of feature transformation means and said training control means, thereby recognizing the image.

According to another aspect of the present invention, there is provided a method for classifying an inputted signal pattern into one of a plurality of predetermined classes so as to recognize the inputted signal pattern, including the following steps of:

transforming the inputted signal pattern into vectors in a plurality of feature spaces corresponding respectively to said classes by executing a feature transformation process by means of a predetermined transformation parameter corresponding to each of said classes so as to emphasize a feature of each of said classes;

calculating a value of a discriminant function by means of a predetermined discriminant function representing a similarity measure of each of said classes for said vectors in said plurality of feature spaces which are obtained through said feature transformation process;

executing a signal pattern recognition process by selecting a class to which the inputted signal pattern belongs based on the calculated values of said plurality of discriminant functions corresponding respectively to said classes; and

training and setting the transformation parameter of said feature transformation process and each of said discriminant functions, so that an error probability of said signal pattern recognition is minimized based on a predetermined training signal pattern.

In the above-mentioned method, said transforming step preferably includes a step of linearly transforming the inputted signal pattern into vectors in said plurality of feature spaces corresponding respectively to said classes by projecting the inputted signal pattern onto a predetermined basis vector and multiplying resulting vectors by a predetermined real number.

In the above-mentioned method, each of said discriminant functions is preferably a predetermined quadric discriminant function representing the similarity measure of each of said classes.

In the above-mentioned method, said training step preferably includes a step of performing adaptation of the transformation parameter of said feature transformation process and said discriminant functions, so that the error probability of said signal pattern recognition is minimized, based on the predetermined training signal pattern, by means of an adaptive minimization method utilizing a probabilistic descent theorem.

Therefore, the present invention provides a new apparatus and method for training of the signal pattern recognition apparatus having a higher accuracy in recognizing an unknown signal pattern different from the training signal pattern of the signal pattern recognition apparatus. With the above-mentioned apparatus and method of the present invention, the training can be performed so that a feature metric space for effectively representing the class identity, namely, the metric of the discriminant function inherent in each class is achieved, and the recognition error is reduced. Therefore, in contrast to the conventional practice that a similarity evaluation has been performed in a common feature space in every class, an evaluation is performed in a space inherent in each class for representing the feature of the class in the present invention. With the above arrangement, the variation factor can be suppressed in the unknown signal pattern, and the recognition capability is improved to allow a recognition accuracy higher than that of the conventional apparatus to be obtained.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other objects and features of the present invention will become clear from the following description taken in conjunction with the preferred embodiments thereof with reference to the accompanying drawings throughout which like parts are designated by like reference numerals, and in which:

FIG. 1 is a block diagram of a signal pattern recognition apparatus according to a preferred embodiment of the present invention;

FIG. 2 is a block diagram of a conventional signal pattern recognition apparatus;

FIG. 3 is a block diagram of a classification section 2 as shown in FIG. 2;

FIG. 4 is a conventional signal pattern recognition apparatus using a conventional classification section 2 shown in FIG. 3;

FIG. 5A is an explanatory view of examples of signal patterns in an original space which are handled in the signal pattern recognition apparatus of the preferred embodiment shown in FIG. 1;

FIG. 5B is an explanatory view of examples of signal patterns in a metric space of class #1 when the original space shown in FIG. 5A is transformed into two feature metric spaces #1 and #2 in the signal pattern recognition apparatus of the preferred embodiment shown in FIG. 1;

FIG. 5C is an explanatory view of examples of signal patterns in a metric space of class #2 when the original space shown in FIG. 5A is transformed into two feature metric spaces #1 and #2 in the signal pattern recognition apparatus of the preferred embodiment shown in FIG. 1;

FIG. 6 is a schematic view showing an operation of feature converters 10-1 through 10-K which are shown in FIG. 1; and

FIG. 7 is a flowchart showing a parameter training process executed by a parameter training controller 20 which is shown in FIG. 1.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Preferred embodiments according to the present invention will be described below with reference to the attached drawings.
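The training step summarized above adapts the transformation parameters and the discriminant functions by an adaptive minimization method utilizing a probabilistic descent theorem. As a rough illustration only, the following Python sketch performs one such update on a single training pattern. The sigmoid-smoothed loss, the misclassification measure `d`, the diagonal per-class scale vectors (a simplification of the basis-vector projection followed by multiplication by a real number), and all names (`discriminants`, `descent_step`, `refs`, `scales`, `step`, `alpha`) are assumptions made for this sketch and are not taken from the specification.

```python
import numpy as np

def discriminants(x, refs, scales):
    """Assumed quadric form g_s(x) = ||scales[s] * (x - refs[s])||^2 for every class s.

    refs and scales are float arrays of shape (K, d); smaller g_s means more similar.
    """
    return np.sum((scales * (x - refs)) ** 2, axis=1)

def descent_step(x, label, refs, scales, step=0.05, alpha=1.0):
    """One gradient update on a single labelled pattern (hypothetical helper).

    Modifies refs and scales in place and returns the smoothed loss
    measured before the update.
    """
    g = discriminants(x, refs, scales)
    rivals = g.copy()
    rivals[label] = np.inf
    rival = int(np.argmin(rivals))             # best competing class
    d = g[label] - g[rival]                    # d > 0 means x is misrecognized
    loss = 1.0 / (1.0 + np.exp(-alpha * d))    # smooth stand-in for the 0-1 error count
    weight = alpha * loss * (1.0 - loss)       # derivative of the loss with respect to d
    # The true class enters d with sign +1, the rival class with sign -1.
    for s, sign in ((label, 1.0), (rival, -1.0)):
        diff = x - refs[s]
        grad_ref = -2.0 * scales[s] ** 2 * diff    # d g_s / d refs[s]
        grad_scale = 2.0 * scales[s] * diff ** 2   # d g_s / d scales[s]
        refs[s] -= step * weight * sign * grad_ref
        scales[s] -= step * weight * sign * grad_scale
    return loss

# Toy usage: a pattern exactly halfway between two class reference vectors.
refs = np.array([[0.0, 0.0], [2.0, 0.0]])
scales = np.ones((2, 2))
x = np.array([1.0, 0.0])
loss_before = descent_step(x, 0, refs, scales)
```

In this toy case the update pulls the reference vector of the true class toward the pattern and pushes the rival away, so the pattern ends up on the correct side of the boundary. Repeating the step over a stream of training patterns with a decreasing `step` is the usual way such a rule is applied, although the schedule used by the embodiment is not given in this part of the text.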
(1) Structure of Signal Pattern Recognition Apparatus

FIG. 1 shows a signal pattern recognition apparatus of the present preferred embodiment used as a speech recognition apparatus. The signal pattern recognition apparatus of the present preferred embodiment has a training mode for training apparatus parameters based on a training signal pattern and a recognition mode for executing a recognition process for an unknown signal pattern.

Referring to FIG. 1, the signal pattern recognition apparatus of the preferred embodiment comprises a microphone 200, a feature extraction section 201, a plurality of K feature converters 10-1 through 10-K, a plurality of K discriminant function calculators 11-1 through 11-K, a selector 12, and a parameter training controller 20. In particular, the signal pattern recognition apparatus of the present preferred embodiment has a pair of one feature converter and one discriminant function calculator for respective ones of a plurality of classes for performing a signal pattern recognition classification. Further, the signal pattern recognition apparatus of the present preferred embodiment is characterized in that it comprises the parameter training controller 20 which sets a feature transformation parameter of the feature converter and a discriminant function parameter of the discriminant function calculator provided for respective classes by training through adaptation so that the class feature is recognized according to the feature of the corresponding class, or an error probability of the signal pattern recognition is minimized.

The microphone 200 converts an inputted speech into an audio analog signal, and outputs the audio analog signal to the feature extraction section 201. Then, the feature extraction section 201 converts the inputted audio analog signal into audio digital data by subjecting the inputted audio analog signal to an analog to digital conversion process at a predetermined sampling frequency and in a predetermined number of quantization bits. Thereafter, the feature extraction section 201 executes, for example, a linear prediction analysis (LPC) on the inputted audio digital data obtained through the conversion, thereby extracting vectors of feature parameters such as 32-dimensional LPC cepstrum coefficients of the speech, and outputs the extracted vectors as an inputted signal pattern x to the feature converters 10-1 through 10-K and to the parameter training controller 20.

The feature converters 10-1 through 10-K have predetermined feature transformation parameters for performing a feature transformation process, wherein the parameters can be altered by the parameter training controller 20 and correspond to the discriminant function calculators 11-1 through 11-K. Each of the feature converters executes a feature transformation process for emphasizing the feature of the class corresponding to the inputted signal pattern x to transform the inputted signal pattern x into feature vectors y_s (s=1, 2, ..., K), and then outputs the feature vectors to the corresponding one of the discriminant function calculators 11-1 through 11-K. In the present case, the feature transformation process is to map the inputted signal pattern x in an orthogonal projection manner on a basis vector which is a class-feature axis representing a feature of a corresponding class C_s, and then to multiply the mapped signal pattern by a predetermined real number, thereby expanding or contracting the mapped signal pattern. With the above-mentioned operation, the inputted signal pattern x is transformed into the feature vectors y_s so that the inputted signal pattern x can be evaluated in a feature metric space representing the essential class identity.

Further, the discriminant function calculators 11-1 through 11-K have predetermined discriminant functions, each of which can be altered by the parameter training controller 20 and represents a similarity measure for a predetermined class or a membership of the class, and operate to calculate a discriminant function value g_s(y_s) (s=1, 2, ..., K) corresponding to the inputted feature vectors y_s so as to output the discriminant function value g_s(y_s) to the selector 12. The selector 12 outputs, as a classification result, information of a class corresponding to the discriminant function calculator (one of 11-1 through 11-K) which outputs the minimum discriminant function value among a plurality of K discriminant function values g_s(y_s) inputted to the selector 12.

In the training mode, the parameter training controller 20 trains and sets the feature transformation parameters of the feature converters 10-1 through 10-K and the discriminant function parameters of the discriminant function calculators 11-1 through 11-K based on the signal pattern x inputted for the training so that the class feature is recognized according to the feature of the corresponding class, or the error probability of the signal pattern recognition is minimized. In other words, the above-mentioned parameters are subjected to an adaptation so that the signal pattern separability between classes can be increased. After the above-mentioned training mode, the recognition mode is set, and an inputted speech is inputted to the microphone 200, thereby making the feature extraction section 201, the feature converters 10-1 through 10-K, the discriminant function calculators 11-1 through 11-K, and the selector 12 operate to execute the signal pattern recognition.

It is preferred that the feature extraction section 201, a plurality of K feature converters 10-1 through 10-K, a plurality of K discriminant function calculators 11-1 through 11-K, the selector 12, and the parameter training controller 20 be implemented by an electric digital computer such as a micro computer or the like.

In other words, the present preferred embodiment discloses a discriminant function metric training method for forming a class feature metric space important for the signal pattern recognition. As a simplest metric training method, it can be considered to perform principal component analysis of a set of training samples every class, assume that the eigenvector (or the characteristic vector) representing an axis giving a higher-order principal component exhibits a more strict correspondence to the variation factor in respective classes, and use an axis corresponding to the lower-order principal component as a coordinate axis for representing the class feature. When a discriminant function is a quadric discriminant function, the discriminant function obtained by the present training method is substantially equal to Gaussian discriminant function or Mahalanobis distance. The training method is effective as a means for smartly expressing a variation factor of a given set of the training samples. However, the training in each class is performed independently of the other classes, and an adjustment at an end portion of a sample distribution which may incur a recognition error, i.e., around the class boundary, is insufficient. For the above-mentioned reasons, there is no guarantee for assuring obtainment of a metric for presenting an optimum discrimination. Therefore, according to the preferred embodiment of the present invention, a method for introducing a training based on a discrimination result and discriminatively training a metric concerning each class is utilized. Concretely, a Minimum Classification Error/Generalized Probabilistic Descent method is utilized as a training method. Hereinafter, the metric training method according to the minimum error training is referred to as a Discriminative Metric Design method.
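The recognition path described above (one feature converter and one discriminant function calculator per class, followed by a selector that picks the minimum value) and the "simplest metric training method" based on per-class principal component analysis can be sketched together as follows. This is a minimal illustration under stated assumptions, not the embodiment itself: subtracting a per-class mean as a reference vector, scaling every principal axis by the inverse square root of its eigenvalue, the plain squared-norm discriminant, and all function names (`fit_class_metric`, `feature_convert`, `discriminant`, `recognize`) are choices made for this sketch. With these choices the discriminant value equals the squared Mahalanobis distance to the class, which matches the remark in the text about the quadric case.

```python
import numpy as np

def fit_class_metric(samples):
    """Per-class principal component analysis (the baseline, non-discriminative training).

    samples has shape (n, d). Returns (mean, basis, scales), where the columns of
    basis are orthonormal class axes and scales are the real-number multipliers.
    """
    mean = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)
    eigvals, basis = np.linalg.eigh(cov)
    eigvals = np.maximum(eigvals, 1e-12)   # guard against degenerate axes
    scales = 1.0 / np.sqrt(eigvals)        # contract high-variance axes, expand low-variance ones
    return mean, basis, scales

def feature_convert(x, metric):
    """Feature converter for one class: project onto the class axes, then scale."""
    mean, basis, scales = metric
    return scales * (basis.T @ (x - mean))

def discriminant(y):
    """Quadric discriminant in the class metric space; smaller means more similar."""
    return float(y @ y)

def recognize(x, metrics):
    """Selector: index of the class whose discriminant value is the minimum."""
    values = [discriminant(feature_convert(x, m)) for m in metrics]
    return int(np.argmin(values)), values

# Toy usage with two synthetic 2-D classes of different shape.
rng = np.random.default_rng(0)
class_a = rng.normal([0.0, 0.0], [3.0, 0.3], size=(500, 2))   # wide along axis 0
class_b = rng.normal([4.0, 0.0], [0.3, 3.0], size=(500, 2))   # wide along axis 1
metrics = [fit_class_metric(class_a), fit_class_metric(class_b)]
label, values = recognize(np.array([3.0, 0.0]), metrics)
```

The test point lies closer to the mean of the second class in the original space, yet it is assigned to the first class, because each class is evaluated in its own metric space. As the text notes, this baseline fits each class independently of the others, which is the limitation the discriminative training is meant to address.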
(2) Overview of Statistical Signal Pattern Recognition and Role of Metric

(2-1) Signal pattern recognition according to Bayes decision rule

In the present case, a theme for classifying a d-dimensional inputted signal pattern $x \in R^d$, where d is a natural number, into a plurality of K classes $\{C_s\}_{s=1}^{K}$ is considered. For instance, in the case of speech recognition, the class $C_s$ corresponds to a linguistic category such as a phoneme, a word or the like. In a character recognition apparatus, the class $C_s$ is each character. In an image recognition apparatus, the class $C_s$ is a predetermined image signal pattern. The character or the image signal pattern is represented, for example, in a form of dot image data. On the other hand, the inputted signal pattern x corresponds to a feature parameter such as a short-time power spectrum and an LPC coefficient obtained by taking an objective portion out of, for example, a continuous speech signal waveform or extracting a feature of the waveform taken out. A recognition decision rule C(x) is defined as a mapping from an inputted signal pattern space $R^d$ to the class space $\{C_s\}_{s=1}^{K}$, which is the feature metric space of the class, as expressed by the following Equation (1).

$$C(x): R^d \rightarrow \{C_s\}_{s=1}^{K} \tag{1}$$

As shown in FIG. 3, the recognition decision rule C(x) has been conventionally regarded as a process for classifying a feature value on the assumption that x is a feature value which has already undergone a feature extraction process. However, in the present preferred embodiment, there are considered:

(a) the above-mentioned generic case, and

(b) a case where both a process for extracting the feature value from the inputted signal pattern x and a process for classifying the feature value are included on the assumption that the inputted signal pattern x is regarded as an observed value.

It is herein assumed that a loss in such a case where an inputted signal pattern x belonging to the class $C_k$ is recognized is $l_k(C(x))$. The loss $l_k(C(x))$ satisfies the following Equation (2).

$$l_k(C(x)) = 0 \ \text{ for } C(x) = C_k, \qquad l_k(C(x)) \ge 0 \ \text{ for } C(x) \ne C_k \tag{2}$$

The loss $l_k(C(x))$ is set so as to reflect a risk of an action based on a recognition result of the signal pattern recognition apparatus. In the present preferred embodiment, the purpose of training is to minimize the recognition error probability. Therefore, a loss in a case of a correct recognition is determined to be zero, and a loss in a case of an erroneous recognition is determined to be one, which gives a 0-1 loss expressed by the following Equation (3).

$$l_k(C(x)) = 0 \ \text{ for } C(x) = C_k, \qquad l_k(C(x)) = 1 \ \text{ for } C(x) \ne C_k \tag{3}$$

The loss with respect to all the signal patterns, i.e., the expected loss which is the expected value of the loss, is given by the following Equation (4).

$$L(C(x)) = \sum_{k=1}^{K} \int l_k(C(x))\, p(x, C_k)\, dx \tag{4}$$

In the above-mentioned Equation (4), $p(x, C_k)$ represents a joint probability density between the inputted signal pattern x and the class $C_k$. In particular, when the loss $l_k$ is a 0-1 loss expressed by the Equation (3), the expected loss L(C(x)) corresponds to the recognition error probability.

According to the statistical signal pattern recognition based on the Bayes decision rule, a signal pattern recognition apparatus which has a decision rule C(x) where the expected loss L(C(x)) is minimized is considered to be most preferable (See, for example, the Reference Document 2, or R. O. Duda and P. E. Hart, "Pattern Classification and Scene Analysis", New York: John Wiley & Sons, 1973 (referred to as a Reference Document 7 hereinafter)). In more detail, the signal pattern recognition apparatus is trained by obtaining a decision rule C(x) where the expected loss L(C(x)) is minimized by means of a set of training samples. However, since the number of the training samples is limited to a finite value even if a lot of training samples are prepared, it is extremely difficult to perform a really optimum training of the signal pattern recognition apparatus. Therefore, a practical training problem of the signal pattern recognition apparatus is a problem for obtaining a decision rule C(x) where the expected loss L(C(x)) is minimized with respect to all the signal patterns as far as possible by means of a finite number of training signal patterns $\{x_n;\ n = 1, 2, \ldots, N\}$. A concrete training method of the signal pattern recognition apparatus based on the Bayes decision rule is divided broadly into two categories of the maximum a posteriori probability decision method (Bayes method) and the discriminant function method as follows.

(2-1-1) Maximum a posteriori probability decision method (Bayes method)

The expected loss of the Equation (4) can be rewritten as the following Equation (5).

$$L(C(x)) = \int \left\{ \sum_{k=1}^{K} l_k(C(x)) \Pr(C_k \mid x) \right\} p(x)\, dx \tag{5}$$

In the above-mentioned Equation (5), $\Pr(C_k \mid x)$ represents an a posteriori probability of the class $C_k$ on condition that the signal pattern x is given, and p(x) represents an appearance probability density function of the signal pattern x. As is evident from the above-mentioned Equation (5), it can be understood that, when the signal pattern x is given, a decision rule for making the signal pattern x correspond to the class C(x) where the portion inside the braces { } is minimized minimizes L(C(x)). In more detail, the decision rule C(x) where the expected loss is minimized is given by the following Equation (6).

$$C(x) = C_i \quad \text{if } i = \arg\min_s \sum_{k=1}^{K} l_k(C_s) \Pr(C_k \mid x) \tag{6}$$

In the above-mentioned Equation (6), arg min of the right side represents the value of s when the value positioned on the right side thereof is minimized. The decision rule according to the above-mentioned Equation (6) is referred to as a Bayes decision rule (See, for example, the Reference Document 2 and the Reference Document 7). In particular, in the case of the 0-1 loss given by the Equation (3), the Equation (6) results in the following Equation (7).

$$C(x) = C_i \quad \text{if } i = \arg\max_s \Pr(C_s \mid x) \tag{7}$$

In the above-mentioned Equation (7), the arg max of the right side represents the value of s when the value positioned on the right side thereof is maximized. In other words, there is achieved a signal pattern recognition apparatus in which the decision rule for assigning the inputted signal pattern x to a class $C_i$ providing the maximum a posteriori probability has the minimum error probability. The above-mentioned rule is particularly referred to as the maximum a posteriori probability decision rule. According to Bayes' theorem, the decision rule according to the Equation (7) is equivalent to the following Equation (8).



$$C(x) = C_i \quad \text{if } i = \arg\max_s \Pr(C_s)\, p(x \mid C_s) \tag{8}$$



In the above-mentioned Equation (8), $\Pr(C_s)$ represents an a priori probability of the class $C_s$, and $p(x \mid C_s)$ represents a conditional probability density function of the signal pattern x in the class $C_s$. As described above, the Bayes decision rule is to perform classification by means of a discriminant function composed of the a priori probability of each class and the conditional probability density function of the signal pattern in the respective classes.
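As a minimal sketch of the decision rule of Equation (8), the following Python fragment selects the class that maximizes the product of the a priori probability and the class-conditional density. The one-dimensional Gaussian densities, the function names `gaussian_pdf` and `map_decide`, and all numeric values are illustrative assumptions and are not taken from the disclosed apparatus.

```python
import math


def gaussian_pdf(x, mean, var):
    """Univariate Gaussian density, used here as a stand-in for p(x | C_s)."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def map_decide(x, priors, class_params):
    """Return the index s that maximizes Pr(C_s) * p(x | C_s), cf. Equation (8)."""
    scores = [prior * gaussian_pdf(x, mean, var)
              for prior, (mean, var) in zip(priors, class_params)]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical two-class problem: (mean, variance) per class.
class_params = [(0.0, 1.0), (4.0, 1.0)]

decision_near_first = map_decide(0.5, [0.5, 0.5], class_params)
decision_near_second = map_decide(3.5, [0.5, 0.5], class_params)
# At the midpoint the two densities are equal, so the a priori probability decides.
decision_midpoint = map_decide(2.0, [0.9, 0.1], class_params)
```

With equal variances the midpoint case shows the role of the a priori probability: the likelihoods tie, and the class with the larger prior is chosen.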

The above-mentioned decision rule can achieve the recognition with the minimum error probability only when the correct a priori probability and the correct conditional probability density function are given. However, it is practically difficult or almost impossible to obtain the true values. Therefore, a means for estimating the probability densities from a limited number of training samples is utilized. The above-mentioned means is the principle of training of the maximum a posteriori probability decision method (Bayes method). Assuming that a training sample set whose classes are known is given, the a priori probability $\Pr(C_s)$ and the conditional probability density function $p(x \mid C_s)$ of each class $C_s$ are estimated based on probability models $\Pr(C_s; \eta_s)$ and $p(x \mid C_s; \lambda_s)$ thereof. In the present case, $\hat{\eta}_s$ and $\hat{\lambda}_s$ are the estimated parameters of the a priori probability model and the conditional probability density function model, respectively. Eventually, according to the Bayes method, the signal pattern recognition apparatus is trained by preparatorily designating functions of the a priori probability model and the conditional probability density function model, which are probability models, and estimating unknown parameters of the models by means of training samples, and then, recognition of an unknown inputted signal pattern is executed according to the following Equation (9).

$$C(x) = C_i \quad \text{if } i = \arg\max_s \Pr(C_s; \hat{\eta}_s)\, p(x \mid C_s; \hat{\lambda}_s) \tag{9}$$
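The training principle of the Bayes method described above might be sketched as follows, assuming one-dimensional signal patterns and a single Gaussian model per class whose parameters are fitted by maximum likelihood. The names `fit_bayes_model` and `classify`, the variance floor and the sample values are hypothetical choices made only for illustration.

```python
import math
from collections import defaultdict


def fit_bayes_model(samples, labels, var_floor=1e-6):
    """Estimate (prior, mean, variance) per class from labelled training samples."""
    grouped = defaultdict(list)
    for x, label in zip(samples, labels):
        grouped[label].append(x)
    model = {}
    for label, xs in grouped.items():
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs)
        # The floor keeps the density finite when a class has almost no spread.
        model[label] = (len(xs) / len(samples), mean, max(var, var_floor))
    return model


def classify(x, model):
    """Pick the class maximizing log prior + log Gaussian density, cf. Equation (9)."""
    def log_score(label):
        prior, mean, var = model[label]
        return (math.log(prior)
                - 0.5 * math.log(2.0 * math.pi * var)
                - (x - mean) ** 2 / (2.0 * var))
    return max(model, key=log_score)


samples = [0.0, 0.2, -0.1, 3.9, 4.1, 4.0]
labels = ["a", "a", "a", "b", "b", "b"]
model = fit_bayes_model(samples, labels)
label_low = classify(0.1, model)
label_high = classify(3.8, model)
```

Working in the log domain does not change the arg max and avoids numerical underflow when the densities are very small.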



For fitting of each model, a statistical estimation method such as the maximum likelihood method (See, for example, the Reference Document 7, and a joint work by Yoshiyuki Sakamoto, Makio Ishiguro, and Genshirou Kitagawa, "Information Statistics", Lecture on information science A. 5. 4, Kyouritsu Shuppan, 1983 (referred to as a Reference Document 8 hereinafter)) is generally adopted.

However, the above-mentioned classification method has two big problems. One problem is the fact that decision of a function of an optimum probability model is difficult. There can be considered a single Gaussian distribution as the simplest model, however, the distribution of an actual inputted signal pattern is more complicated. Therefore, a mixture Gaussian distribution for expressing a complicated distribution by giving a plurality of average vectors (i.e., so-called templates) to each class and combining a plurality of Gaussian distributions is often used (See, for example, the Reference Document 5). However, when the number of mixtures is increased to express the complicated distribution, the number of parameters to be estimated at the same time is increased, and the model represents the limited number of training samples excessively faithfully, resulting in incurring such a problem that a discrimination capability to the unknown samples deteriorates. The above-mentioned phenomenon is referred to as over-training. Therefore, as a means for reducing the number of parameters while setting the number of templates at an appropriate value, there is often taken a heuristic method for designating a diagonal covariance matrix or a method for applying an information criterion (See, for example, the Reference Document 8). However, the estimation accuracy of the probability distribution extremely deteriorates, and therefore, the achievement of the minimum error recognition becomes more difficult.

The other problem is an inconsistency of the estimation accuracy of the probability distribution with the recognition error probability. Since the number of training samples is limited to a finite value, the estimated probability distribution is almost always accompanied by an error with respect to the true distribution. It is considered that the erroneous recognition occurs substantially in the vicinity of the class boundary. The Bayes method tends to faithfully represent the model in a portion on which the training samples concentrate apart from the class boundary, and therefore, an accumulation of errors may possibly concentrate on and around the class boundary. In other words, it can not be said that an improvement of the estimation accuracy of the probability distribution provides a direct contribution to the minimization of the error probability. Therefore, in the preferred embodiment of the present invention, the maximum a posteriori probability decision method (Bayes method) is not used, and the following discriminant function method is used instead of this method.

(2-1-2) Discriminant function method

The discriminant function method is a method for training the signal pattern recognition apparatus by setting a discriminant function which is a measure representing a membership of an inputted signal pattern to each class and training the discriminant function so that a loss brought by the recognition result of training samples is minimized. In comparison with the above-mentioned Bayes method, the discriminant function method can be regarded as a training method intended directly to minimize the loss or the recognition error probability. According to the discriminant function method, the following Equations (10) and (11) are given as a decision rule.

$$C(x) = C_i \quad \text{if } i = \arg\max_s g_s(x) \tag{10}$$

$$C(x) = C_i \quad \text{if } i = \arg\min_s g_s(x) \tag{11}$$
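The two decision rules of Equations (10) and (11) can be sketched in a few lines of Python. The squared Euclidean distance to a class reference vector is used as an example of a discriminant function for the minimum rule, and its negation as a similarity for the maximum rule; the helper names and the reference vectors are assumptions introduced only for this illustration.

```python
def squared_distance_to(reference):
    """Build g_s(x): squared Euclidean distance from x to a class reference vector."""
    def g(x):
        return sum((xi - ri) ** 2 for xi, ri in zip(x, reference))
    return g


def decide_max(x, discriminants):
    """Equation (10): index of the class with the largest discriminant value."""
    values = [g(x) for g in discriminants]
    return max(range(len(values)), key=values.__getitem__)


def decide_min(x, discriminants):
    """Equation (11): index of the class with the smallest discriminant value."""
    values = [g(x) for g in discriminants]
    return min(range(len(values)), key=values.__getitem__)


# Hypothetical class reference vectors in a two-dimensional pattern space.
references = [(0.0, 0.0), (3.0, 3.0)]
distances = [squared_distance_to(r) for r in references]
# Negated distances behave as similarities, so the maximum rule applies to them.
similarities = [lambda x, g=g: -g(x) for g in distances]

class_by_distance = decide_min((2.5, 3.1), distances)
class_by_similarity = decide_max((2.5, 3.1), similarities)
```

Both rules return the same class here because the similarity is simply the negated distance; in general the choice between (10) and (11) depends on whether the discriminant function measures similarity or dissimilarity.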



In other words, the decision rule C(x) is expressed by a function set $g(x) = \{g_s(x)\}_{s=1}^{K}$. In the present case, $g_s(x)$ is referred to as a discriminant function, and represents the membership of the inputted signal pattern x to the class $C_s$. The above-mentioned Equation (10) is a decision rule for making the signal pattern x correspond to the class representing the maximum discriminant function value. As a discriminant function, a probability model for expressing a similarity and such a case where an angle between two vectors is adopted can be enumerated as examples. On the other hand, the above-mentioned Equation (11) is a decision rule for assigning the signal pattern x to the class representing the minimum value of the discriminant function of the signal pattern x. A case where a distance from a class reference vector is applied as a discriminant function is an example of the above-mentioned rule. In other words, in contrast to the Bayes method limited to the probability model, a function in
a wider range can be designated according to the discriminant function method.

The training of the signal pattern recognition apparatus is performed by training a discriminant function set g(x) such that the expected loss L(C(x)) is reduced as far as possible by means of a set of training samples $\{x_n;\ n = 1, 2, \ldots, N\}$ whose classes have been known. It is to be noted that the functional minimization problem can be hardly solved directly. Therefore, the training is performed by normally setting a function set $g(x; \Theta)$ and estimating the parameter $\Theta$. According to the training of the signal pattern recognition apparatus based on the discriminant function method, formalization of a loss function $l_k(x; \Theta) = l_k(g(x; \Theta))$ and a minimization method of the expected loss of the objective function are important problems. A basic method, the Minimum Classification Error/Generalized Probabilistic Descent method, is disclosed in, for example, B. H. Juang and S. Katagiri, "Discriminative learning for minimum error classification", IEEE Transactions on Signal Processing, Vol. 40, No. 12, pp. 3043-3054, in December, 1992 (referred to as a Reference Document 6 hereinafter). The Minimum Classification Error/Generalized Probabilistic Descent method is a signal pattern recognition apparatus training method which realizes a formalization of the decision rule C(x) and the loss function $l_k(x; \Theta)$ according to a smooth first-order differentiable function, and enables a practical application of a minimization method such as a gradient method.

Intuitively speaking, the training according to the discriminant function method is performed by correcting a discriminant function so that the possible occurrence of the recognition error is suppressed as far as possible. The above-mentioned architecture means an increase of accuracy at the class boundary where the erroneous recognition tends to occur. It can be understood that the discriminant function method performs a training of an essential portion for the increase of the recognition accuracy in contrast to the Bayes method which is based on the fitting of a stochastic model. Therefore, in the present preferred embodiment of the present invention, the training of the signal pattern recognition apparatus is performed according to the discriminant function method.

However, even according to the discriminant function method, the problem of the reduction of the accuracy in recognizing an unknown sample different from the training sample is serious. For instance, when the number of templates which are class reference vectors is increased in order to express the variation of signal patterns in a class according to a classification method based on a distance measure represented by a multi-template distance classifier (LVQ: See, for example, the Reference Document 4), the problem of over-training about the training sample occurs in a manner similar to that of the case of the mixture Gaussian function according to the Bayes method. Normally, the number of templates (or the degree of freedom of the parameter) is heuristically determined, however, such an architecture does not lead to a rational achievement of the increase of the accuracy in recognizing the unknown sample though it contributes to reduction of a difference in recognition accuracy between the training sample and the unknown sample.

(2-2) Role of discriminant function metric

In a statistical signal pattern recognition, the increase of the accuracy in recognizing the unknown sample different from the training sample is an essentially important problem. The above-mentioned matter is the most important problem in an actual environment which tends to include a variation unnecessary for the recognition in a case of, for example, a speaker-independent speech recognition and a noisy speech recognition. According to either the above-mentioned Bayes method or the discriminant function method, a popular means for preventing the deterioration of the accuracy in recognizing the unknown sample is to sacrifice the accuracy in recognizing the training sample. However, such an architecture does not lead to a rational achievement of the increase of the accuracy in recognizing the unknown sample for the reason as described hereinbefore.

For the purpose of training the signal pattern recognition apparatus independently of the difference between samples with high accuracy, it can be considered proper to learn some attributes for efficiently expressing the class feature from the limited number of training samples. Here is considered the training of the feature metric space having small variations in all the classes. Referring to FIG. 4 which shows a conventional apparatus, a feature metric space is given commonly to all the classes. However, it is considered difficult to obtain a common feature metric space having small variations in all the classes. In view of the above, according to the present preferred embodiment, a feature metric space for smartly expressing the class identity is provided in each of the classes in a manner as shown in FIG. 1 in contrast to the conventional apparatus shown in FIG. 4 in which the similarity evaluation is performed in a feature metric space common to all the classes. The above-mentioned arrangement of the present preferred embodiment can be comprehended as a matter of training the "metric" of the discriminant function inherent in each of the classes so as to perform the similarity evaluation in a feature space inherent in each of the classes.

When the above-mentioned training can be achieved, any unessential variation can be suppressed by the signal pattern recognition in evaluating the similarity between the unknown sample and each class, and the similarity evaluation can be performed in a feature metric space such that it effectively expresses the class identity in a manner as shown in FIG. 5. In other words, in the original space, the signal pattern of Class #1 and the signal pattern of Class #2 exist randomly. In the feature metric space of Class #1, the signal pattern belonging to Class #1 is positioned in a location within a predetermined range to allow the similarity evaluation of Class #1 to be easily achieved. On the other hand, in the feature metric space of Class #2, the signal pattern belonging to Class #2 is positioned in a location within a predetermined range to allow the similarity evaluation of Class #2 to be easily achieved. A mathematical definition and a training method of the "metric" will be described below.

(3) Training of Metric

(3-1) Formalization of Metric

First of all, a linear transformation $L_{is}$ from an original signal pattern space X to a feature metric space $Y_s$ of a class $C_s$ is defined by the following Equation (12). The linear transformation is executed in the feature converters 10-1 through 10-K shown in FIG. 1.

$$y_s = L_{is}(x) = \Phi_s V_s^T x; \quad s = 1, 2, \ldots, K \tag{12}$$

$$\Phi_s = \mathrm{diag}(\phi_{s,1}, \phi_{s,2}, \ldots, \phi_{s,d}) \tag{13}$$

$$V_s = [v_{s,1}\ v_{s,2}\ \cdots\ v_{s,d}] \quad \text{where } V_s^T V_s = I \tag{14}$$

In the above-mentioned Equations (12), (13) and (14), the superscript T represents a transposition of a matrix, the diag(.) represents a diagonal matrix, and I represents an identity matrix or a unit matrix. A role of the transformation
of the above-mentioned Equations can be easily understood from FIG. 6 in a case of two dimensions (d=2). Each column vector of the orthogonal matrix $V_s$ is an orthogonal basis vector of a d-dimensional vector space. An i-th component of the transformed vector is obtained by orthogonally projecting an inputted signal pattern x in a direction of an i-th basis vector $v_{s,i}$ and further multiplying the resulting value by a real number $\phi_{s,i}$ to effect expansion or contraction (i.e., weighting), though the weighting is not shown in FIG. 6. Hereinafter, the vector $v_{s,i}$ and the real number $\phi_{s,i}$ are referred to as an i-th class-feature axis and a weighting parameter in the class $C_s$, respectively. In the example shown in FIG. 6, the original signal pattern is represented by two axes $X_1$ and $X_2$ which are perpendicular to each other, where the inputted signal pattern x expressed by OP is transformed into a feature metric space after the linear transformation composed of two basis vectors $v_{s,1}$ and $v_{s,2}$ which are perpendicular to each other. A component OQ transformed into the basis vector $v_{s,1}$ is expressed as $|v_{s,1}^T x|$, while a component OR transformed into the basis vector $v_{s,2}$ is expressed as $|v_{s,2}^T x|$.

The class-feature axis and the weighting parameter can be [...] similarity evaluation. Therefore, by making the axis representing a variation unnecessary for the signal pattern recognition correspond to the class-feature axis containing the weighting parameter having the small value in each class, an evaluation with stress on the feature metric space representing the essential class identity is achieved. In other words, the similarity evaluation is performed within the feature metric space inherent in each class, thereby allowing the suppression of the variation of similarity evaluation due to [...]. [...] each of the spaces are referred to as a "metric", and the space $Y_s$ in which the feature vector $y_s$ moves is referred to as a feature metric space of the class $C_s$. The training method for the signal pattern recognition apparatus of the present preferred embodiment also trains the metric at the same time together with the parameter of the classification section 2 adopted generally. In other words, according to the present preferred embodiment, both of the feature converters 10-1 through 10-K and the discriminant function calculators 11-1 through 11-K are trained.

(3-2) Discriminant function to be handled

As a similarity measure in the feature metric space in each class, generally a variety of measures are applicable. In other words, the similarity measure selection and the metric training can be considered to be performed independently of each other. As the simplest measure having the widest general use, the Euclidean distance measure can be enumerated. When seeing the Euclidean distance measure from the original signal pattern space X, a discriminant function corresponding to an s-th class $C_s$ (s = 1, 2, ..., K) is given by a quadric discriminant function expressed by the following Equation (15).

$$g_s(x; \Theta) = g(x; \Theta_s) = \| L_{is}(x) - L_{is}(r_s) \|^2 + \sum_{i=1}^{d} \log(\varepsilon + \delta \phi_{s,i}^{-2}) = (x - r_s)^T A_s (x - r_s) + \log \det(\varepsilon I + \delta A_s^{-1}) \tag{15}$$

where

$$A_s = V_s \Phi_s^2 V_s^T \tag{16}$$

$$A_s^{-1} = V_s \Phi_s^{-2} V_s^T \tag{17}$$

$$\Theta = \{\Theta_1, \Theta_2, \ldots, \Theta_K\} \tag{19}$$

$$\Theta_s = \{r_s, \Phi_s, V_s\} \tag{20}$$

In the above-mentioned Equation (15), $\|\cdot\|^2$ of the right member of the second equation represents the square of a Euclidean norm of a vector, "det" represents a determinant, and $r_s$ represents a reference vector of the class $C_s$. The parameter $\Theta_s$ of the pattern recognition apparatus concerning the class $C_s$ is the class reference vector $r_s$ and the metric $(\Phi_s, V_s)$, and $\varepsilon$ and $\delta$ are previously determined non-negative constants. The constant $\varepsilon$ is a constant for preventing the discriminant function from being negative. In other words, when $\varepsilon \ge 1$, the discriminant function is always non-negative. On the other hand, $\delta A_s^{-1}$ is a constraint term for preventing $A_s$ from becoming a zero matrix in the training. When the reference vector $r_s$ and the matrix $A_s$ are the mean and the inverse of the covariance matrix in the class [...], the discriminant function becomes the Gaussian discriminant function. In other words, the discriminant function according to the Equation (15) is [...] the above-mentioned quadric discriminant function [...].

(3-3) Training of Metric according to principal component analysis

As the simplest metric training method, there can be considered a method using a principal component analysis of a set of training samples belonging to each class. The principal component analysis is a data analysis method for searching an orthogonal axis [...] components from multi-variate data (See, for example, [...]). [...] of the variation factor in each class can be [...]. First of all, a class sample mean $m_s$ and a class sample covariance matrix $R_s$ are calculated according to the following Equations (21) and (22).

$$m_s = \frac{1}{N_s} \sum_{n=1}^{N_s} x_n^{(s)} \tag{21}$$

$$R_s = \frac{1}{N_s} \sum_{n=1}^{N_s} (x_n^{(s)} - m_s)(x_n^{(s)} - m_s)^T \tag{22}$$

In the above Equations (21) and (22), $\{x_n^{(s)}\}$ represents training samples belonging to the class $C_s$, while $N_s$ represents the number of the training samples. Further, the class sample covariance matrix $R_s$ is subjected to eigen-decomposition by means of the following Equation (23).

$$R_s = \Psi_s \Gamma_s \Psi_s^T \tag{23}$$

$$\Gamma_s = \mathrm{diag}(\gamma_{s,1}, \ldots, \gamma_{s,d}) \tag{24}$$

$$\Psi_s = [\psi_{s,1}\ \cdots\ \psi_{s,d}] \quad \text{where } \Psi_s^T \Psi_s = I \tag{25}$$

The principal component axis can be given by the eigenvector $\{\psi_{s,i}\}_{i=1}^{d}$ of the class sample covariance matrix $R_s$. An eigen-value $\{\gamma_{s,i}\}_{i=1}^{d}$ corresponding to each eigenvector is equal to a sample variance of the entry obtained by orthogonally projecting a sample in a direction of the eigenvector. Therefore, the eigenvector corresponding to the great eigen-value of a class sample covariance matrix in each class can be regarded as the axis representing the dispersion of the samples in the class. In the present case, the metric $(\Phi_s, V_s)$ can be expressed by the following Equations (26) and (27).

$$\Phi_s = \Gamma_s^{-1/2} \tag{26}$$

$$V_s = \Psi_s \tag{27}$$

According to the above-mentioned Equation (26), the square root of an inverse number of the eigen-value of the covariance matrix is designated as the weighting parameter, resulting in regarding the eigenvector of the lower-order principal component as the basis axis for representing the class feature. As a result, when the class sample mean is taken as a center vector $r_s$ and the distance measure in the feature metric space is the Euclidean distance, the quadric discriminant function based on the principal component analysis is substantially equal to the conventional Gaussian discriminant function or the conventional Mahalanobis distance. In other words, the Gaussian discriminant function can be regarded as the quadric discriminant function in such a case where the metric of each class is obtained from a class-dependent statistical analysis of the training samples.

(3-4) Training of Discriminative metric based on Minimum error recognition

The principal component analysis method is a statistically high accuracy method as an information compression method of an inputted signal pattern. However, training the parameters in each class is performed independently of the other classes, and therefore, the metric tends to be faithfully trained in the portions where the training samples are concentrated. Since an adjustment at an end portion of the sample distribution where substantially the erroneous recognition tends to occur, i.e., in the vicinity of the class boundary, is insufficient, there is no guarantee for obtaining a metric for giving an optimum discrimination to an unknown sample. Therefore, the training method based on [...] concerning each class is adopted in the present preferred embodiment [...]. For simplicity, the above-mentioned training method is referred to as a Discriminative Metric Design method.

[...]

The discriminant function method is to perform training of a signal pattern recognition apparatus intended directly to minimize the expected loss of the signal pattern recognition apparatus, i.e., the recognition error probability. According to the present training method, formalization of a loss function $l_k(x; \Theta)$ and the method for minimizing the expected loss which is the objective function are important problems. The loss function is desired to smartly reflect the risk of an action based on the classification result of each classifier. In particular, when the training of the minimum error signal pattern recognition apparatus is the purpose or goal of the training, the 0-1 loss given by the Equation (3) is consistent in correspondence with the recognition error probability. However, the objective function based on the 0-1 loss is unsmooth concerning the training parameter $\Theta$, i.e., not first-order differentiable. Therefore, a gradient method which is an efficient optimizing method can not be applied, and it means that the loss function is inappropriate in a practical point of view. Therefore, as a loss function which can be more analytically handled, a Perceptron loss and a squared error loss are often adopted (See, for example, the Reference Document 7). Each of the above-mentioned losses has such an advantageous effect that the gradient method can be applied because the objective function is differentiable. However, the losses have no consistency with the minimization of the recognition error probability, and therefore, they are insufficient in terms of an optimum training of the signal pattern recognition apparatus.

A method for concurrently solving the above-mentioned two problems, i.e., the unsmoothness of the objective function and the inconsistency with the minimization of the recognition error probability, is the Minimum Classification Error/Generalized Probabilistic Descent method. The Minimum Classification Error/Generalized Probabilistic Descent method can achieve the formalization of the smooth loss function having a consistency with the minimization of the recognition error probability through two steps of formalization as follows. Here is now considered the classification of an inputted signal pattern x belonging to a class $C_k$. First of all, a measure representing the correctness or incorrectness of classification decision, i.e., a misclassification measure $d_k(x; \Theta)$ is defined. In the present case, an Lp norm type measure as described below is adopted. In more detail, in the case of the maximum discriminant function decision rule of the Equation (10), the misclassification measure $d_k(x; \Theta)$ is defined by the following Equation (28). On the other hand, in the case of the minimum discriminant function decision rule, the misclassification measure $d_k(x; \Theta)$ is defined by the following Equation (29).

$$d_k(x; \Theta) = -g_k(x; \Theta) + \left[ \frac{1}{K-1} \sum_{j \ne k} g_j(x; \Theta)^{\eta} \right]^{1/\eta} \tag{28}$$

$$d_k(x; \Theta) = g_k(x; \Theta) - \left[ \frac{1}{K-1} \sum_{j \ne k} g_j(x; \Theta)^{-\eta} \right]^{-1/\eta} \tag{29}$$

In each of the above-mentioned cases, when the constant $\eta$ is sufficiently greater than one, the misclassification measure $d_k(x; \Theta) \le 0$ corresponds to a correct classification, while the misclassification measure $d_k(x; \Theta) > 0$ corresponds to an incorrect classification. In other words, the misclassification measure expresses a class decision rule in a form of a function, and is smooth concerning the parameter $\Theta$ and first-order differentiable. Then, a smooth 0-1 loss is defined such that it simulates the 0-1 loss of the Equation (3). The 0-1 loss can be formalized in a variety of ways; the sigmoid function of the following Equation (30) is adopted in the present case.

$$l_k(x; \Theta) = l_k(d_k(x; \Theta)) = \frac{1}{1 + \exp(-\alpha (d_k(x; \Theta) - \beta))}, \quad \alpha > 0 \tag{30}$$

The loss function takes a value close to zero when the misclassification measure $d_k(x; \Theta) \le 0$ with respect to a sufficiently great constant $\alpha$ and a constant $\beta$ having a sufficiently small absolute value, i.e., in the case of the correct classification, or takes a value sufficiently close to one when the misclassification measure $d_k(x; \Theta) > 0$, i.e., in the case of the incorrect classification. Therefore, it can be understood that the loss function is a satisfactory approximation of the 0-1 loss function, and consequently, the objective function is a sufficient approximation of the recognition error probability. Furthermore, the loss function is smooth concerning the misclassification measure $d_k$ and first-order differentiable, and therefore, the loss function is smooth also concerning the parameter $\Theta$ and first-order differentiable. By changing the constant $\alpha$, the approximation to the recognition error probability and the smoothness can be adjusted.
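The chain from a class-specific metric to a smooth loss can be sketched as follows. The sketch assumes that the quadric discriminant function of Equation (15) is a weighted squared distance in the class feature space plus a logarithmic term, that the misclassification measure for the minimum decision rule has the Lp norm form of Equation (29), and that the smooth loss is a sigmoid as in Equation (30). All function names, the default constants and the two example classes are hypothetical.

```python
import math


def quadric_discriminant(x, r, phi, V, eps=1.0, delta=1.0):
    """Assumed form of Equation (15): ||Phi V^T (x - r)||^2 + sum_i log(eps + delta / phi_i^2)."""
    diff = [xi - ri for xi, ri in zip(x, r)]
    d = len(diff)
    # Project onto each class-feature axis (column i of V), then weight by phi_i.
    y = [phi[i] * sum(V[j][i] * diff[j] for j in range(d)) for i in range(d)]
    distance = sum(yi * yi for yi in y)
    log_term = sum(math.log(eps + delta / (p * p)) for p in phi)
    return distance + log_term


def misclassification_measure_min(k, g_values, eta):
    """Assumed form of Equation (29) for the minimum rule; requires positive g values."""
    others = [g for j, g in enumerate(g_values) if j != k]
    competitor = (sum(g ** (-eta) for g in others) / len(others)) ** (-1.0 / eta)
    return g_values[k] - competitor


def sigmoid_loss(d, alpha=5.0, beta=0.0):
    """Smooth 0-1 loss of Equation (30)."""
    return 1.0 / (1.0 + math.exp(-alpha * (d - beta)))


identity = [[1.0, 0.0], [0.0, 1.0]]
# Two hypothetical classes with different weightings of the same axes.
classes = [
    {"r": [0.0, 0.0], "phi": [2.0, 0.5], "V": identity},
    {"r": [3.0, 0.0], "phi": [0.5, 2.0], "V": identity},
]
x = [0.2, 1.0]
g_values = [quadric_discriminant(x, c["r"], c["phi"], c["V"]) for c in classes]
d_if_class_0 = misclassification_measure_min(0, g_values, eta=10.0)
d_if_class_1 = misclassification_measure_min(1, g_values, eta=10.0)
loss_if_class_0 = sigmoid_loss(d_if_class_0)
loss_if_class_1 = sigmoid_loss(d_if_class_1)
```

If x truly belongs to the first class, the measure is negative and the loss is near zero; if it belongs to the second class, the same pattern yields a positive measure and a loss near one. Because every step is differentiable in the reference vector and the metric, a gradient method can in principle be applied to this loss.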
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Here is now considered the mininiization proUon <rf the conforms to the following Equatioo (35), a parameter matrix 

objective functioD formed of the loss fimction formalized in {e^*^; t=l, 2, . . .) generated in accordance with the Equations 

the above-mentioned manno. The minimization method can (32) and (33) converges on at least a locally minimum point 

be divided broadly into two categories. As one methods there 6* of the expected loss L(e) with probability one. 

can be considered a mettiod for effecting the minimization of 5 

an cmimical average loss of the following Equatioo (3 1) on ^ 

ail ones of a set {x„;it=l . 2 N} of obtainable training ^i 

samples according to the gradient method such as a steepest 

descent method in a batch processing manner. ^| 



10 



i«<©)=4 E ( I /i(j«^©)iC%«Ci) I At the beginning, the above-mentioned rule is given only 

" \ k^i } g g statistic-dimensional signal pattern, however, 

{l^ifAismie 1 the rule has been expanded for a signal pattern whose 

O^ifitb^ 1 15 dimension dianges d^nding on a sample such as a ^>eedi 

signal pattau With the above-mentioned arrangement, the 

w ^ w « rtix t/ \ . Ai»<^*»n minimization of the cxpcdcd loss or the recognition error 

to the above Equauon (31), 1(.) «pr^cnts a function ^ probabiUty of signal pattern 

which takes a fiindioa vaKie of one cr zaa ^cognition, consistent smooS fonnalization of a loss 

t^knownoutor^e^ofmc^^^^ ^3^^-"^^^ 

'^.n^.fiJ^rJ^^r!^ Wm^nsofasetoftralSngs^ 

sampeisrwhiced^^^^^^^^ ^^^^ , ^ 

samples xs used, the difference bctwea^ the two m^ods is ^ ^| ^ v } (s=l, 2 K), i.e.. the center vector and 

«naU. However the latter ^^^f^^'^"^^ « are^ai^i according to L Minimum Oassiflcation 

has such "«^^l*^P?^*>*!*f ^^^^^^ Ema/Generalized ftobaWEstic Descent method based on 

adapttoanewuseconditionwhlchcuii^ STqu^c discriminant function given by the Equation 

training stage. According to the Mmimum Oasafication ^ VXou£^^on of r, and 

Error/Generalized Probabilistic Decent method used in die ^^^f ^Tas^rbecause of such'a restive condition 

present prefeired cmbodimcat of the p«sent "J'eirtion^ 30 ^'J Vis m ^ogonal matrix is imposed -H^crrfor., the 

foUowing adaptive minimization^^ or*ogOMll«trixV,isex^ 

to the adaptive mininuzation method, updatmg of the param- wiiuu^uu«i£iiia**** , , 

etci 0 is performed according to the following Equation (32) v^**^* 

follows at the time of a natural number i-di iteration. v;=u,^e^ij)£/, ^(e^ j) . u^u^^iJi 

^^^^ 55 In the above-mentioned Equation (36), the above d x d 

^^"^ ® ' ^ ' matrix Up^(e) (p<q) is given by the following Equation 

In Ihc above Equation (32), the parameter B^^ rq^esents (37). 
a parameter value at the time of the t-th iteration, A0(.) 
cqrescnts a correction value of the parameter, and x, and Q 
itpresent respectively a training sample given randomly and 
a dass to which the sample belongs. It is desired for the 
adjustment mechanism acc<»ding to the updating equation to 
be such that the signal pattern recognition apparatus is 
trained so that the expected loss reduces every time a new 
sample is given, and to be intended to r^[ulariy miminiztt (he 
expected loss. A mathematical basis of the above-mentioned 
requiremeats is given by a probabilistic descent theorem as 
follows. 

<Probabilistic Descent Theorem> 

When a given sanqiie x, t>eiongs to a dass the 

^'^^'T^^^^'^'^ Ae is set according to die ^^^^^^ n ,(9) is a d-dimenrional 

following Equation (33). ortiiogonal matrix in which tiie (K p) coim>onent and die (q, 

Ct, ©^>>a-^,rtVeWjv«^^>) (33) q) component represent cos9 respectively, Uie (p, q) com- 

55 ponent rqjrescnts sin8, the (q, p) component represents 

In die above-mentioned Equation (33), V0 represents a -sinS. and the others repiesent a diagonal conqwncnt of 1 

partial differentiation depending on tiie parameter 6, a ^ non-diagonal con^wnent of 0. The matrix Dpjfi) is a 

correction coefficient c, represents a sufitcientiy small posi- so-called Jacobd Rotation which Is an opwator for rotating a 

tive constant, and H represents a positive-definite mateix. In p-axis component and a q-axis component of a 

Che present case, the expected loss L (8) reduces in average ^ d-dimensional vector in a plane generated by die axes by an 

wifli respect to die sufficicady small value of e, in each ^ngle of 6 (See, for example, G. R Golub and C E Van 

iteration. In otiier wosds, die following Equation 34 is Loan, **Matrix Coii?>utations, The Johns Hopkins Univosity 

satisfied. Press, 1989 (referred to as a Reference Document 9 

hereinafter). The above Equation (36) expresses the fact that 
65 the oAhogonal matrix is expressed by the Jacobi Rotation 



40 



45 



SO 



COfO tBOS 



(37) 



E{t(e«H<e^'"»>)>SO (34) 



Rndicraiore, if an infiinite number sequence {x^ tsl, 2, . corresponding to axis pairs of aQ combinations. The number 
.) extracted randomly is used for be training and of die combinations is d(d-iy2 equal to the degree of 
freedom of the orthogonal matrix, and then, it is understood that an angle θ_{s,p,q} is required to be adjusted instead of correcting the orthogonal matrix V_s itself under the restrictive condition. With the above-mentioned arrangement, the parameters to be adjusted according to the Minimum Classification Error/Generalized Probabilistic Descent method are of the following three types.

[Equations (38) through (40), which list the three types of parameters for s = 1, 2, ..., K, are not legible in this copy.]

(3-4-3) Initialization of Discriminative Metric Design

A method of optimizing the Minimum Classification Error/Generalized Probabilistic Descent method is based on the gradient method, and therefore, the parameter matrix converges on a variety of locally-optimal solutions depending on the initial value thereof. Therefore, it is preferred to provide an initial value which is as close as possible to a globally-optimum solution. As an initial value r_s^(0) of r_s, it is preferred to give a class sample mean vector in a manner similar to that of the normal distance measure. As initial values Φ_s^(0) and V_s^(0) of the metric, there can be considered a unitary weighting and a Cartesian coordinate system (Φ_s^(0) = I, V_s^(0) = I) or values (Φ_s^(0) = [...], V_s^(0) = E_s) according to the above-mentioned principal component analysis. The former corresponds to the Euclidean distance, while the latter corresponds to the Gaussian discriminant function or the Mahalanobis distance. In particular, since the latter is a value obtained by the principal component analysis, the value can be considered to be a statistically satisfactory initial value. Therefore, the present preferred embodiment of the present invention uses the latter.

However, in the case of initialization by the principal component analysis, the initial setting of the parameter θ_s [...] the class C_s is rotated by the initial value V_s^(0) [...]

[Equations (41) through (43) are not legible in this copy.]

In the above-mentioned Equations (41) through (43), W_s is apparently an orthogonal matrix, and therefore, the orthogonal matrix is expressed by the following Equation (44) in a manner similar to that of the Equation (36).

$$ W_s = U_{1,2}(\theta_{s,1,2})\, U_{1,3}(\theta_{s,1,3}) \cdots U_{d-1,d}(\theta_{s,d-1,d}) \tag{44} $$

In this case, an initial value of the orthogonal matrix W_s can be expressed by the following Equation (45).

$$ W_s^{(0)} = I \tag{45} $$

Therefore, it is required to set all the initial values of the parameter θ_s to zero, respectively. With the above-mentioned arrangement, the parameter Θ_s (s=1, 2, ..., K) to be adjusted can be expressed by the following Equations (46) through (48).

[Equations (46) through (48) are not legible in this copy.]

Then initial values of them are given by the following Equations (49) through (52).

[Equations (49) through (52) are not legible in this copy.]

(3-4-4) Derivation of DMD updating equation

In the Minimum Classification Error/Generalized Probabilistic Descent method, a parameter correction value when a training sample x_t whose correct class is C_k is given is expressed by the Equation (33). When the sigmoid function is adopted as the loss function, the gradient of the loss function is expressed by the following Equations (53) and (54) according to a chain rule.

$$ \nabla_\Theta l_k(x; \Theta) = l_k'(d_k)\, \nabla_\Theta d_k(x; \Theta) \tag{53} $$

$$ \nabla_\Theta l_k(x; \Theta) = a\, l_k(x; \Theta) \{ 1 - l_k(x; \Theta) \}\, \nabla_\Theta d_k(x; \Theta) \tag{54} $$

In the above-mentioned Equations (53) and (54), ∇_Θ d_k [...]

[Equations (55) through (57) are not legible in this copy.]

Further, the gradient of the discriminant function g(x; Θ_s) is given by the following Equations (58) through (62).

[Equations (58) through (62) are not legible in this copy.]

As a result, as an updating equation of the parameter Θ_s = [...] (s=1, 2, ..., K) in the Discriminative Metric
Design, the following Equations (63) through (65) can be obtained from the Equation (32), the Equation (33) and the Equations (54) through (62).

[Equations (63) through (65) are not legible in this copy.]

In the above-mentioned Equations (63) through (65), the superscript t represents the number of repetition steps, while the superscript [...] represents a square. On the other hand, l_k(x_t; Θ) is obtained from the Equations (29) and (30), [...](x_t; Θ) is obtained from the Equation (57), [...] the Equation (62). It is to be herein noted that the weighting positive-definite matrix H of the Equation (33) is used as an [...] each parameter can be achieved. The center [...] of each class is obtained by adjusting the parameter ρ_s. An [...] orthonormal axis V_s of each class is obtained by adjusting the parameter θ_s. An optimum weighting parameter [...] is obtained by adjusting [...]

FIG. 7 shows a parameter training process which is the basic algorithm of the Discriminative Metric Design executed by the parameter training controller 20.

Referring to FIG. 7, first, at the Step S1, initialization of [...] has been known is extracted randomly. [...] at the Steps S3, S4 and S5, the values of respective parameters are [...] That is, at the Step S3, the value of a discriminant function [...] is calculated according to the Equation (15). At the Step S4, a misclassification measure d_k(x_t; Θ^(t-1)) is calculated according to the Equation (29). At the Step S5, a loss l_k^(t-1) is calculated according to the Equation (30). Further, at the Step S6, the parameter Θ is updated according to the following Equation (66) [...] the Equations (32), (33) and (54).

$$ \Theta^{(t)} = \Theta^{(t-1)} - \varepsilon_t\, a\, l_k^{(t-1)} \left( 1 - l_k^{(t-1)} \right) \nabla_\Theta d_k(x_t; \Theta^{(t-1)}) \tag{66} $$

In the above-mentioned Equation (66), the parameter Θ is expressed by a matrix of a column vector composed of Θ_1, Θ_2, ..., Θ_K, while the parameter Θ_s is expressed by a matrix of a column vector composed of ρ_s, φ_s, θ_s (s=1, 2, ..., K). On the other hand, ∇_Θ d_k(x_t; Θ^(t-1)) is expressed by a matrix of a column vector composed of ∇_{Θ_1} d_k(x_t; Θ^(t-1)), ∇_{Θ_2} d_k(x_t; Θ^(t-1)), ..., ∇_{Θ_K} d_k(x_t; Θ^(t-1)). Further, ∇_{Θ_s} d_k(x_t; Θ^(t-1)) is expressed by the following Equation (67).

[Equation (67) is not legible in this copy.]

In the above-mentioned Equation (67), [...](x_t; Θ^(t-1)) is expressed by the Equation (57). On the other hand, ∇_{Θ_s} g(x_t; Θ_s^(t-1)) is expressed by a matrix of a column vector composed of ∂g(x_t; Θ_s)/∂ρ_s, ∂g(x_t; Θ_s)/∂φ_s, and ∂g(x_t; Θ_s)/∂θ_s when Θ_s = Θ_s^(t-1). It is to be noted that ∂g(x_t; Θ_s)/∂ρ_s is expressed by the Equation (58), while ∂g(x_t; Θ_s)/∂φ_s is expressed by the Equation (59). Further, ∂g(x_t; Θ_s)/∂θ_s is expressed by a matrix of a column vector composed of ∂g(x_t; Θ_s)/∂θ_{s,1,2}, ∂g(x_t; Θ_s)/∂θ_{s,1,3}, ..., ∂g(x_t; Θ_s)/∂θ_{s,d-1,d} in such a condition where p<q, and p and q are within a range of 1, 2, ..., d in the Equation (61).

Further, at the Step S7, it is decided whether or not a decision condition is satisfied. When the decision condition is satisfied, it is decided that the adaptation of the parameter training is completed, and the parameter training operation is completed. Otherwise, when the decision condition is not satisfied, a numeral of one is added to the iteration parameter t to update the parameter at the Step S8, and the program flow proceeds to the Step S2 to repeat the above-mentioned operation. In regard to the decision condition, the operation may be completed at the time when the iteration frequency t reaches a predetermined frequency, or the operation may be completed at the time when the degree of reduction of the average loss concerning the set of the training samples becomes smaller than a predetermined value. In a simulation which will be described in detail below, the former is adopted.

(4) Simulation

[...] of recognizing basic speaker-independent Japanese five vowels [...] pattern recognition apparatus of the present preferred embodiment [...] verify the feasibility of the Discriminative Metric Design [...] comparison with the conventional generic signal pattern recognition apparatus based on the Mahalanobis distance measure and an LVQ signal pattern recognition apparatus representative of the multi-template [...], the effectiveness of the Discriminative Metric Design adopted by the present preferred embodiment will be corroborated.

[...] including 512 isolated words spoken by 70 [...] frequency of 12 kHz in 16 quantization bits. A Hamming window is taken by multiplication out of [...] frames 10 msec about a center portion of each vowel segment [...] a frame having a frame length of 20 msec, and then subjected to an LPC cepstrum analysis. The resulting cepstrum coefficient is [...] token or an inputted signal pattern for the signal pattern recognition apparatus. According to the simulation, an LPC degree and a cepstrum degree [...] 32. In other words, each inputted token is a 32-dimensional single LPC cepstrum spectrum. Data of 50 [...] speakers are used as a set of the training samples for the signal pattern recognition apparatus, and two groups each composed of 10 persons out of the remaining 20 persons are used as two types of test sets. Test recognition results of the two types are referred to as a "Test 1" and a "Test 2", respectively. The number of the training tokens is about 7500, and the number of tokens in each test set is about 1500. A training process for optimizing the parameters of the signal pattern recognition apparatus is as follows.
First of all, speech waveform data corresponding to respective vowels is taken out of the 512 training-use isolated word speech data of 50 persons, and information of the relevant class, i.e., any of the five vowels, is incorporated into the data to form a training-use speech waveform data set. Thereafter, each speech waveform of the training speech waveform data set is inputted to the feature extraction section 201 of the signal pattern recognition apparatus of the present preferred embodiment of the present invention. The feature extraction section 201 converts each speech waveform of the inputted training-use speech waveform data set into a 32-dimensional LPC cepstrum coefficient spectrum x_n (n=1, 2, ..., N), and outputs the resulting data to the signal pattern training controller 20. Then, the signal pattern recognition apparatus is set to the training mode, and the signal pattern training controller 20 executes the parameter training process as shown in FIG. 7. In the present case, s=1, 2, ..., 5, and K=5. After the training process is completed, the signal pattern recognition apparatus is set to the recognition mode. Then the feature extraction section 201, the feature converters 10-1 through 10-K, the discriminant function calculators 11-1 through 11-K, and the selector 12 are made to operate, thereby executing the signal pattern recognition process on the training set as well as the Test 1 and the Test 2. Table 1 shows a result of the simulation. In Table 1 are shown error probabilities of speech recognition executed by the signal pattern recognition apparatus of the present preferred embodiment provided with the classifier which uses the Discriminative Metric Design based on the quadric discriminant function (the classifier including the feature converter and the discriminant function calculator of the present preferred embodiment will be referred to as a DMD classifier hereinafter), the conventional Mahalanobis distance classifier, and the conventional LVQ classifier having a plurality of templates based on the Euclidean distance.

TABLE 1

Error probabilities of speech recognition in tasks of Japanese five vowels

Classifier                               Training set   Test 1      Test 2
DMD classifier                           3.84%          8.78%       10.[?]9%
Mahalanobis distance classifier          8.80%          13.10%      13.42%
LVQ classifier (with one template)       10.51%         10.93%      15.62%
LVQ classifier (with eight templates)    5.00%          14.[?]8%    12.93%
LVQ classifier (with 16 templates)       3.46%          16.41%      13.40%


As is apparent from Table 1, the DMD classifier of the present preferred embodiment exhibits, for the Test 1 and the Test 2, a speech recognition accuracy higher than any of the conventional Mahalanobis distance classifier and the conventional LVQ classifiers. Furthermore, it can be understood that, even when one template is used, the DMD classifier of the present preferred embodiment, which provides a unique feature metric space for each class, increases the robustness of the signal pattern recognition apparatus in comparison with the case where the number of templates is increased in the conventional LVQ classifier.
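
The decision rule discussed above, in which each class evaluates the input in its own feature metric space and the selector takes the smallest discriminant value, can be illustrated by the following minimal sketch. It is not the apparatus of the embodiment: the function names `class_distance` and `select_class`, the two hypothetical 2-dimensional classes, and the assumed form of the discriminant function (a squared norm after projecting onto class axes and weighting each axis) are illustrative assumptions only.

```python
import numpy as np


def class_distance(x, center, axes, weights):
    """Squared distance of x from a class center, measured in that class's own metric.

    Assumed form (illustrative only): project (x - center) onto the class axes
    (columns of the orthogonal matrix `axes`), scale each axis by `weights`,
    and take the squared norm.
    """
    z = weights * (axes.T @ (x - center))
    return float(z @ z)


def select_class(x, class_params):
    """Return the index of the class with the minimum discriminant value, plus all values."""
    scores = [class_distance(x, center, axes, weights)
              for center, axes, weights in class_params]
    return int(np.argmin(scores)), scores


# Two hypothetical classes: class 0 spreads widely along axis 0,
# class 1 spreads widely along axis 1 (small weight = tolerant direction).
identity = np.eye(2)
class_params = [
    (np.array([0.0, 0.0]), identity, np.array([0.2, 2.0])),
    (np.array([3.0, 1.0]), identity, np.array([2.0, 0.2])),
]

x = np.array([2.5, 0.2])
label, scores = select_class(x, class_params)

# For comparison: a single Euclidean metric shared by every class.
euclid_label = int(np.argmin([np.linalg.norm(x - center)
                              for center, _, _ in class_params]))
print("class-specific metric:", label, "shared Euclidean metric:", euclid_label)
```

In this toy setting the sample lies closer to the center of class 1 in the shared Euclidean metric, but the class-specific metrics assign it to class 0, because the sample deviates from class 0 only along the direction in which class 0 is allowed to vary.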

(5) Difference between the Discriminative Metric Design of the present preferred embodiment and the other methods

As described hereinbefore in the item of (3-3) the metric training by the principal component analysis, a method for performing the principal component analysis of the set of the training samples in respective classes can be enumerated as the simplest method in regard to the metric training. In the case of a classifier based on the quadric discriminant function, the Gaussian discriminant function or the Mahalanobis distance can be enumerated as the representatives of the discriminant function trained according to the metric training method based on the principal component analysis. However, as described hereinbefore, the principal component analysis is insufficient in terms of achieving the essential object of minimizing the recognition error of the signal pattern recognition apparatus, since the training is performed independently in each class and the metric training is achieved without taking into account the influence of the other classes. The above-mentioned fact has been corroborated by the above-mentioned simulation.
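
The per-class principal component analysis mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the helper names `pca_metric` and `mahalanobis_sq` and the synthetic samples are hypothetical, and the weighting is taken as the inverse square root of each eigenvalue so that the resulting distance equals the squared Mahalanobis distance. Note that `pca_metric` sees the samples of one class only, which is exactly the independence between classes that the paragraph above identifies as the weakness of this training.

```python
import numpy as np


def pca_metric(samples):
    """Mean, principal axes and per-axis weights estimated from ONE class's samples."""
    mean = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)        # sample covariance of this class
    eigvals, eigvecs = np.linalg.eigh(cov)     # symmetric matrix: real eigenpairs
    weights = 1.0 / np.sqrt(eigvals)           # assumed weighting: 1 / sqrt(variance)
    return mean, eigvecs, weights


def mahalanobis_sq(x, mean, axes, weights):
    """Squared Mahalanobis distance, computed in the class's principal-axis space."""
    z = weights * (axes.T @ (x - mean))
    return float(z @ z)


# Synthetic samples of a single hypothetical class (3 dimensions).
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.5]])
samples = rng.normal(size=(200, 3)) @ mixing + np.array([1.0, -2.0, 0.5])

mean, axes, weights = pca_metric(samples)
x = np.array([0.0, 0.0, 0.0])
print(mahalanobis_sq(x, mean, axes, weights))
```

The value agrees with the direct form (x - mean)^T Σ^{-1} (x - mean), since Σ^{-1} = E Λ^{-1} E^T for the eigen-decomposition Σ = E Λ E^T.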

Recently, there is a growing trend of recognizing a Gaussian distribution type Hidden Markov Model (HMM) speech signal pattern recognition apparatus based on the training according to the Minimum Classification Error/Generalized Probabilistic Descent method as a method for achieving a high-accuracy speech recognition (See, for example, W. Chou, B. H. Juang, and C. H. Lee, "Segmental GPD training of HMM based speech recognizer", Proceedings of ICASSP 92, Vol. 1, pp. 473-476, 1992 (referred to as a Reference Document 10 hereinafter) or D. Rainton and S. Sagayama, "Minimum error classification training of HMMs - implementation details and experimental results", Journal of the Acoustical Society of Japan (E), Vol. 13, No. 6, pp. 379-387, November, 1992 (referred to as a Reference Document 11 hereinafter)). Conventionally, there is generally used a mixture Gaussian distribution type HMM which can handle only the Gaussian distribution of the diagonal covariance matrix in order to allow the application of a Generalized Probabilistic Descent method, and assigns a plurality of diagonal covariance Gaussian distributions to each class in order to perform modeling of a signal pattern variation. Eventually, the above-mentioned apparatus is substantially equal to the multi-template distance classifier (LVQ) which assigns a plurality of reference vectors to each class. However, as described hereinbefore in the items of (2-1-1) Maximum a posteriori probability decision method and (2-1-2) Discriminant function method, an increase of the number of reference vectors having a single Gaussian distribution and the robustness with respect to an unknown sample are in a trade-off relationship. Therefore, it is difficult to decide the optimum number of reference vectors for the recognition of an unknown sample. As is easily presumed, the Discriminative Metric Design based on the quadric discriminant function can be substantially applied to the Gaussian distribution type HMM classifier based on the quadric discriminant function. In particular, the arrangement that the orthogonal matrix V_s, which is a class-feature axis, can be adjusted allows the optimization according to the generic HMM Minimum Classification Error/Generalized Probabilistic Descent method having a Gaussian distribution based on a generic covariance matrix, not limited to the diagonal matrix. The above-mentioned simulation result indicates the fact that the means for training the unique feature metric space of each class according to the Discriminative Metric Design improves the recognition accuracy concerning an unknown sample in comparison with the means for assigning a plurality of reference vectors. Presumably, the above-mentioned fact provides a new knowledge in considering the robustness of the HMM speech signal pattern recognition apparatus with respect to an unknown sample.
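
Why an adjustable orthogonal matrix removes the restriction to diagonal covariance can be illustrated with the Jacobi Rotation parameterization of the Equations (36) and (37). The following is a minimal sketch, not the embodiment's implementation: the function names, the 0-based axis indices, the example angles and weights, and the metric form V diag(w^2) V^T are assumptions made for illustration. The point is that any choice of the d(d-1)/2 angles yields an orthogonal matrix, so a gradient method can adjust the angles freely without ever violating the orthogonality condition.

```python
from itertools import combinations

import numpy as np


def jacobi_rotation(d, p, q, theta):
    """d x d rotation by theta in the plane of axes p and q (0-based, p < q).

    Components follow the description of Equation (37):
    (p, p) and (q, q) are cos(theta), (p, q) is sin(theta), (q, p) is -sin(theta).
    """
    u = np.eye(d)
    c, s = np.cos(theta), np.sin(theta)
    u[p, p] = c
    u[q, q] = c
    u[p, q] = s
    u[q, p] = -s
    return u


def orthogonal_from_angles(d, angles):
    """Product of rotations over all axis pairs (p, q) with p < q, as in Equation (36)."""
    pairs = list(combinations(range(d), 2))    # (0,1), (0,2), ..., (d-2, d-1)
    if len(angles) != len(pairs):
        raise ValueError("expected d*(d-1)/2 angles")
    v = np.eye(d)
    for (p, q), theta in zip(pairs, angles):
        v = v @ jacobi_rotation(d, p, q, theta)
    return v


d = 3
angles = [0.3, -0.5, 0.8]                      # arbitrary illustrative values
v = orthogonal_from_angles(d, angles)

# Assumed metric form: axes v with per-axis weights w give the matrix v diag(w^2) v^T,
# which is generally a full (non-diagonal) symmetric matrix.
w = np.array([1.0, 2.0, 3.0])
metric = v @ np.diag(w ** 2) @ v.T

print(np.round(v @ v.T, 6))                    # identity: v stays orthogonal
print(np.round(metric, 3))                     # off-diagonal terms are non-zero
```

Setting every angle to zero gives the identity matrix, which corresponds to the diagonal case; non-zero angles introduce the off-diagonal terms.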

The linear transformation according to the Discriminative Metric Design executed in each class can be also regarded as a feature extraction process. A recently proposed discriminative feature extraction method (See A. Biem and S. Katagiri, "Feature extraction based on minimum classification error/generalized probabilistic descent method", Proceedings of ICASSP 93, Vol. 2, pp. 275-278, April, 1993 (referred to as a Reference Document 12 hereinafter)) is a method for synthetically training both the feature extraction process and the classifying process according to the Minimum Classification Error/Generalized Probabilistic Descent method, thereby solving a mismatch between both the processes for the original purpose of minimizing the recognition error of the recognizer. However, in contrast to the fact that a common feature space is given to every class according to the discriminative feature extraction method, a unique feature metric space is provided for each class according to the Discriminative Metric Design used by the present preferred embodiment. Furthermore, according to the Reference Document 12, the optimization of only the weighting of each cepstrum coefficient is performed. However, according to the Discriminative Metric Design, not only the weighting but also the orthogonal axes of the feature metric space can be adjusted. In the above-mentioned point, the Discriminative Metric Design is different from the discriminative feature extraction method.

In a manner similar to that of the Discriminative Metric Design, a Subspace method (See the Reference Document 1), a Multiple Similarity method relevant to the above-mentioned method (See Taizo Iijima, "Pattern recognition theory", Morikita Shuppan, 1989 (referred to as a Reference Document 13 hereinafter)), a Learning Subspace method (See the Reference Document 1), and a Compound Similarity method (See the Reference Document 13) are proposed, each as a classifying method for classifying a signal pattern by providing a unique feature space for each class. The above-mentioned four methods assign a "Subspace" representing each class feature to each class, and adopt an "orthogonal projection" of an inputted signal pattern to each Subspace as a discriminant function. It is to be noted that, according to the Multiple Similarity method and the Compound Similarity method, weighting is effected on each basis vector of each Subspace. Herein, for simplicity, the above-mentioned four methods are generally referred to as a projection-based classification method.

The Discriminative Metric Design and the projection-based classification method fundamentally differ from each other in the type of inputted signal patterns to be handled. The projection-based classification method, which uses the orthogonal projection as a discriminant function, can be applied only to a signal pattern whose class does not change if the signal pattern is multiplied by a fixed number. On the other hand, the Discriminative Metric Design based on the distance function can handle arbitrary types of signal patterns by normalizing a signal pattern by its size as described above. In other words, the projection-based classification method can be applied to an original speech waveform, an image signal pattern, a power spectrum, and the like; however, the method cannot be applied to a logarithmic power spectrum, a cepstrum, an LPC coefficient, and the like. In contrast to the above, the Discriminative Metric Design adopted by the present preferred embodiment can handle any of the above-mentioned signal patterns.

Furthermore, the Discriminative Metric Design and the projection-based classification method differ from each other in the handling of the class-feature axis. According to the projection-based classification method, the class-feature axis has a significant meaning as a template of each class, and corresponds to the reference vector of the Discriminative Metric Design method (or the LVQ classifier). On the other hand, the Discriminative Metric Design of the present preferred embodiment handles the class-feature axis as a mapping from the original signal pattern space to each class feature metric space. In other words, only the class template is trained according to the projection-based classification method in a manner similar to that of the LVQ classifier. In contrast to the above, both of the template and the feature metric space are trained according to the Discriminative Metric Design of the present preferred embodiment of the present invention.

(6) Advantageous Effects

According to the present preferred embodiment as described above, there is adopted the Discriminative Metric Design, which is a method for training the feature metric space for effectively expressing the unique feature of each class, or the metric of the unique discriminant function of each class, in a discriminating manner, or so that the recognition error is reduced, as a new method for training a signal pattern recognition apparatus with high accuracy for an unknown sample different from the training samples of the signal pattern recognition apparatus. Therefore, in contrast to the fact that the similarity evaluation has been performed in a feature space common to every class, the evaluation is performed in a unique feature metric space that is provided for each class and expresses the feature of each class in the present preferred embodiment. Therefore, the variation factor is suppressed even for an unknown sample, thereby improving the recognition capability. Furthermore, the structure of the present preferred embodiment is relatively simple. By adopting the Minimum Classification Error/Generalized Probabilistic Descent method, of which effectiveness as a signal pattern classifier training method intended for maximizing the recognition rate has been known, a discriminant function metric really essential to the recognition can be obtained.

(7) Modification examples

Although the signal pattern recognition apparatus to be applied to a speech recognition apparatus has been described in the above-mentioned preferred embodiments, the present invention is not limited to this, and the signal pattern recognition apparatus may be applied to a character recognition apparatus or an image recognition apparatus. It is to be noted that, in the case of the character recognition apparatus or the image recognition apparatus, instead of the microphone 200 and the feature extraction section 201, it is required to provide an image scanner for converting a character which is handwritten, typed, or printed on a sheet of paper, or an image such as an image signal pattern, into a dot image signal pattern and outputting the converted signal pattern to the feature converters 10-1 through 10-K and the signal pattern training controller 20. In the above-mentioned case, after training the apparatus with a predetermined character or image signal pattern provided for the training, a character recognition or an image recognition can be performed.

It is to be noted that, instead of providing the feature extraction section 201 in the speech recognition apparatus, audio data obtained by converting a speech signal waveform obtained in the microphone 200 through an analog to digital conversion process may be directly used as an inputted signal pattern.

Although the feature extraction section 201 converts the inputted speech signal into a 32-dimensional LPC cepstrum coefficient vector in the above-mentioned preferred embodiment, the present invention is not limited to this. In
the case of the speech recognition apparatus, the signal may be converted into another speech feature parameter such as a power spectrum and a logarithmic power spectrum to be [...]

Although the quadric discriminant function is used as a discriminant function representing the similarity measure of each class in the above-mentioned preferred embodiment, the present invention is not limited to this, and a discriminant function as follows may be used instead.

(a) A discriminant function which gives a plurality of center vectors to each class and represents a distance between an inputted signal pattern and each of the center vectors in a manner similar to that of the Reference Document 4 is used.

(b) A discriminant function which represents an angle between vectors of an inputted signal pattern and a reference signal pattern such as a feature parameter in a natural number n-dimension is used.

(c) A discriminant function representing a likelihood, i.e., a density function value of an inputted signal pattern is used.

In the above-mentioned preferred embodiment, the selector 12 outputs the information of the class corresponding to the discriminant function calculator which outputs the minimum discriminant function value among a plurality of K discriminant function values. However, the present invention is not limited to this. In a case where the similarity measure of a class is high when the discriminant function value is great, the selector is arranged so that the information of the class corresponding to the discriminant function calculator which outputs the maximum discriminant function value among a plurality of K discriminant function values is outputted as a classification result.

Although the present invention has been fully described in connection with the preferred embodiments thereof with reference to the accompanying drawings, it is to be noted that various changes and modifications are apparent to those skilled in the art. Such changes and modifications are to be understood as included within the scope of the present invention as defined by the appended claims unless they depart therefrom.

What is claimed is:

1. A signal pattern recognition apparatus for classifying an inputted signal pattern into one of a plurality of predetermined classes so as to recognize the inputted signal pattern, comprising:

a plurality of feature transformation means for respectively transforming the inputted signal pattern into vectors in a plurality of feature spaces corresponding respectively to said classes by executing a feature transformation process by means of a predetermined transformation parameter corresponding to each of said classes so as to emphasize a feature of each of said classes, said feature transformation means being provided respectively for said plurality of classes;

a plurality of discriminant function means for respectively calculating a value of a discriminant function by means of a predetermined discriminant function representing a similarity measure of each of said classes for said vectors in said plurality of feature spaces which are

[...] through said calculation executed by said plurality of discriminant function means; and

training control means for training and setting said plurality of transformation parameters of said feature transformation process and said plurality of discriminant functions so that an error probability of said signal pattern recognition is minimized based on a predetermined training signal pattern.

2. The signal pattern recognition apparatus as claimed in claim 1,

wherein each of said plurality of feature transformation means linearly transforms the inputted signal pattern into vectors in said plurality of feature spaces corresponding respectively to said classes by projecting the inputted signal pattern onto a predetermined basis [...] multiplying a resulting vector by a predetermined real number.

3. The signal pattern recognition apparatus as claimed in claim 1,

wherein each of said plurality of discriminant functions of said discriminant function means is a predetermined quadric discriminant function representing the similarity measure of each of said classes.

4. The signal pattern recognition apparatus as claimed in claim 1,

wherein said training control means performs adaptation of said plurality of transformation parameters of said feature transformation process and said plurality of discriminant functions of the discriminant function means, so that the error probability of said signal pattern recognition is minimized, based on said predetermined training signal pattern, by means of an adaptive minimization method utilizing a probabilistic descent theorem.

5. The signal pattern recognition apparatus as claimed in claim 1, further comprising:

signal conversion means for converting an inputted speech into a speech signal and outputting the speech signal; and

feature extraction means for converting the speech signal outputted from said signal conversion means into a predetermined speech feature parameter, and outputting the obtained feature parameter as a signal pattern to said plurality of feature transformation means and said training control means, thereby recognizing the inputted speech.

6. The signal pattern recognition apparatus as claimed in claim 5,

wherein said feature extraction means transforms the speech signal outputted from said signal conversion means into LPC cepstrum coefficient vectors through linear prediction analysis, and outputs resulting vectors as a signal pattern to said plurality of feature transformation means and said training control means.

7. The signal pattern recognition apparatus as claimed in claim 1, further comprising:

image conversion means for converting a character into dot image data, and outputting the dot image data as a

transformed by said pluraUty of feature transformation 60 signal pattern to said plurality of feature transformation 

means, said discriminant function means being pro- means and said training coiirol means, thereby recog- 

vided respectively for said pluraUty of classes; nizing the character, 

sdection means for executing a signal pattern recognition 8. The signal pattem recognition apparams as daimed in 

process by sdectiDg a dass to which (he inputted signal daim 1, furtha oon^sing: 

pattem belongs based on the values of said pluraUty of 65 further image coDversion means for converting an image 

discriminant functions corresponding respectively to into dot image data, and outputting the dot image data 

said classes, said discriminant fanctioDs being nHainrd as a signal pattem to said pluraUiy of feature transfor- 
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mation means and said tiaining control means, theretyy wticrein eacb of said dis m ' m i n a nt functions is a prede- 

recognizing the image. tennined quadric discriminant fiinctioD rqs'esenting 

9. A method for das sifyiog an inputted signal pattern into the similarity measure of each of said classes, 
one of aplurality of jHedctenmned classes so as torecognize 12. The method as claimed in daim 9, 

the inputted signal pattern, induding the following steps of: 5 wherein said training stq) indudes a step of pcrfcraiing 

transforming the inputted signal pattern into vectors in a adaptation of the transformation paramctj^^^ said 

ansioraung inc uapuucu pauau muiai y»^«.» a feature transformatioD process and said discriminant 

pluraUty of feature ^>ace$ cona?»nding n^pectivdy functions, so that the cSror probabiUty of said signal 

to said classes by executing a featnre tranrformation ^ recognition is minimized, based on the prcdc- 

I^ss by means of a predetenmncd transfonnation xammtd training signal pattern, by means of an ad^ 

paramcterMnespondingtocatAofsaidclasscssoasto io minimization method utilizing a probabilistic 

emphasize a feature of each of said classes; ^j^^^ meorcm. 

calculating a value of a discriminant function by means of 13 1^ signal pattern recognition apparatus as claimed in 

a predetermined discriminant function rqjrescnting a i fuithci comprising a feature extraction section for 

similarity measure of each of said classes for said receiving a first signal from an input device and for pioduc- 

vectors in said plurality of feature ^>aces which are ^ ii^juttcd signal pattcnu said feature extraction 

obtained through said feature transfoimadon iH'ocess; section extracting vectca^ of features parameters from said 

executing a signal pattem recognition process by sdecting first signal sudi that data representing said extracted vectors 

a class to whid) the inputted signal pattern bdongs are induded in said iiq^uttcd signal pattem. 

based on (he calculated values of said plurality of ^ 14. The signal pattem recognition apparatus as claimed in 

discriminant functions conresponding respectivdy to claim 1. wherein said training control means is coonected to 

said classes; and recdvc the inputted signal pattem. so as to train and set said 

training and setting the transformation parameter of said pluraUty of transformatioa parameters of said feature trans- 
feature transformation process and each of said dis- fonnatioo process and said pluraUty of discriminant func- 
criminant fuoctloas« so that an eiTor probability of said 25 

signal pattern recognition is minimized based on a 15. The signal pattem recognition apparatus as claimed m 

predetermined training sigoal pattern. claim 1, whcreio a unique feature metric space is provided 

10. The method as daimed In claim 9. wherein said for each dass. 

transfOTning stq> indudes a step of Unearly transforming 1^ The method as claimed in claim 9. further conning 

the inputted signal pattern into vectors in said phmdity of 30 step of: 

feature spaces corresponding re^>ccUvdy to said dasses by extracting vectors of feature parameters from an initial 

projecting the inputted signal pattern onto a predetcnnined signal and producing the inputted signal pattern based 

basis vector and multiplying resulting vectors by a prede- upon the extracted vedors. 
teimined real number. 

11. The method as daimed in claim 9. ♦ * « * * 
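
The recognition flow recited in claims 9 through 11 (a transformation per class, a discriminant value per class, and selection of the minimum or maximum value) can be illustrated by the following minimal sketch. It is not the patented implementation. The function names, the dictionary layout of `CLASSES`, the numeric values, and the choice of a squared Euclidean distance to a per-class reference vector as the quadric discriminant function are all assumptions made for illustration; training of the parameters is not shown.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def transform(x: Vector, basis: Sequence[Vector], scales: Vector) -> List[float]:
    """Project x onto each basis vector, then multiply each projection by a real number."""
    return [s * sum(b_i * x_i for b_i, x_i in zip(b, x)) for b, s in zip(basis, scales)]


def quadric_discriminant(y: Vector, reference: Vector) -> float:
    """Assumed quadric form: squared Euclidean distance to a class reference vector."""
    return sum((y_i - r_i) ** 2 for y_i, r_i in zip(y, reference))


def classify(x: Vector, classes, smaller_is_better: bool = True) -> Tuple[int, List[float]]:
    """Evaluate every class in its own feature space and select one class index."""
    scores = [
        quadric_discriminant(transform(x, c["basis"], c["scales"]), c["reference"])
        for c in classes
    ]
    # Minimum when a small value means high similarity, maximum otherwise.
    pick = min if smaller_is_better else max
    best = pick(range(len(scores)), key=lambda k: scores[k])
    return best, scores


# Hypothetical parameters for two classes in a two-dimensional input space.
CLASSES = [
    {"basis": [[1.0, 0.0], [0.0, 1.0]], "scales": [1.0, 1.0], "reference": [0.0, 0.0]},
    {"basis": [[1.0, 0.0], [0.0, 1.0]], "scales": [2.0, 0.5], "reference": [2.0, 1.0]},
]

best, scores = classify([1.0, 2.0], CLASSES)
print(best, scores)  # prints: 1 [5.0, 0.0]
```

In this sketch each class carries its own basis vectors and scale factors, so the same input is measured in a different feature space for every class before the selector compares the discriminant values. Passing `smaller_is_better=False` corresponds to the case described above in which a greater discriminant value indicates a higher similarity.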